0. A HANDFUL OF POINTERS TO THE ANACONDA SYSTEM

This introductory survey is intended to (1) make it easier for the reader to
understand our method for an analysis of concepts by means of data processing
(ANACONDA) and (2) present some pointers that we hope will give a sufficiently
good picture of the individual steps in the cumulative process of developing
a theory and a method for studying complex psychological phenomena, such as
communication by means of symbols.

We have found when conducting complicated research tasks that the separate
reports contribute little to an understanding of the entire problem field. Each
individual piece of work (report) is too often limited by the phenomenon being
studied, the empirical or experimental situation involved and the data being
collected. In addition, the writer of a report cannot always communicate to the
reader exactly where the individual report fits into a series of publications,
partly because it is difficult to know who the reader is and what prior
knowledge he has of the scientific problem with which a specific report is
concerned.

The presentation that follows gives the background and framework of the 
step described in this report. 

PROBLEMS 



The development of ideas and the formulation and solution of problems are
behaviours that are closely connected with man's specific ability to express
himself verbally. Attempts at exploring and studying not only problem-solving
behaviours but also the complex psychological process behind such behaviours
are not confined to the field of psychology. For the past twenty years or so
such experiments have also been carried out within several other branches of
science, such as mathematics, artificial intelligence, information processing
and quantitative linguistics. The common factor of the research work done
within the various disciplines is the goal of investigating and giving form to
invisible symbolic processes and mechanisms. Instead of using simple
mathematical formulae to try to describe such cognitive operations, researchers
working in the fields named above are trying to develop computer programmes to
describe them and to test models and theories on what complex psychological
structures look like and how processes develop.

CONTENT AND CONTEXT ORIENTED THEORIES 



These attempts at developing a theory on the content of the messages by means
of which people communicate with each other aim at increasing our understanding
of the cognitive structures that are assumed to form the basis for a human
being's verbal expressions. By developing our assumptions step by step and
continuously testing them, we try to determine their constancy. The question
we have asked ourselves is the following:

CAN WE BY MEANS OF NUMERICAL ANALYSIS AND QUANTITATIVE 
DESCRIPTION IDENTIFY AND CATEGORIZE COGNITIVE STRUCTURES 
IN VERBAL DATA, SUCH AS INTERVIEW TEXTS? 

Considering the scientific debate of recent years on process research and the
marked limitations of various assessment schedules as data-gathering methods,
the successful execution of the task we have set ourselves should make a
significant contribution to the research methods that are available to social
scientists.

GUIDE TO METHOD DEVELOPMENT 



Some general facts about the method and model

It is typical of written or spoken text that it is of great complexity and that
the type of information to be extracted from the material is seldom or never
collected in one single place in the text. If structural relationships are to
emerge all the same, the text must be prepared on the basis of certain
assumptions. In Bierschenk & Bierschenk (1976) a flow chart is presented
stating the individual steps in the development of the method for
computer-based content analysis that we suggest. In addition, the
psycho-linguistic model on which our text analysis is based is presented.

The different phases are presented below in chronological order.

Basic material

We have restricted our empirical material to apply for the time being to
"Information-seeking, documentation and research planning for the R&D work of
the school". The design and implementation of an interview study around this
theme is described in B. Bierschenk (1974). In this report the assessment
scales with which the interviewees were confronted during the interview are
also evaluated.

Impressionistic analysis

One way of evaluating interview texts is to use an impressionistic content
analysis. This is based on intuition, insight and impressions, which means that
the interpretation rests on subjectively obtained analysis results. Such an
analysis is to be found in Annerblom (1974).



Computer-based analysis

A model for searching for information in interview texts, a brief description
of preliminary coding rules and some empirical results from the testing of
manual allocation of codes to interview text are presented in B. Bierschenk
(1976).

Construction of a system of rules

The chances of ready-developed methods being applicable are often linked to the
appearance of the material. The attempts at analysis that have been described
in the literature and that are of interest to our analysis have been developed
with written text as a basis. But since our material is spoken language text
(transcribed from recording tape), which when uttered was meant for the ears of
the interviewer alone, it became necessary to build up our own system of rules
and codes. A preliminary manual and some test results are presented in I.
Bierschenk (1974).

Reliability of manual coding

We use the concept "computer-based content analysis" in order to make it plain
that we do not intend to develop a method for automatic text analysis. At the
same time this means that the basic material must first be coded before
mechanical processing of various kinds can be carried out. The success with
which two independent coders have been able to apply different coding rules in
a similar way is described in Berg (1974).

Representation of manifest language structures

Computer-based content analysis is becoming increasingly used and usable
internationally. The demand for programmes and techniques that are adapted to
various problem areas is growing. The theory of linguistic representation that
we have found most interesting is Schank's "Conceptual Dependency Theory". We
came into contact with it after our first coding rules had been worked out and
we found that it is in line with our way of treating the text for input into
the computer. The way in which the input takes place, how identification is
specified in the coding and the way in which we build up our lexicon base are
described in I. Bierschenk (1975).

Theoretical and psychometric problems

It is very difficult to map what is really meant in the research literature by
a content analysis, since each content analysis technique is based on a
specific way of regarding the content of a message. A content analysis
presupposes that we can define that which is to be measured and counted in the
analysis.

Paradigm of the analysis

The unit of the analysis in our psycho-linguistic model goes back to the
well-known paradigm Agent-action-Object (goal). The components are defined
briefly below.

Agent    Centres of action or goal-seeking entities that make use of resources
         to reach their goals. This definition also includes groups,
         organizations or abstractions that fill the function of being an
         agent.

Action   A direct action that is carried out by an agent for the purpose of
         achieving a goal. The action defines the content of the AaO paradigm.

Object   Everything that an action can be directed at or implemented with.

By means of the AaO paradigm, the components that form a natural context, i.e.
an observable sentence, are isolated. By this is meant the fundamental form for
a statement that is expressed by the noun1-verb-noun2 relationship.

While agent and object (nouns) are specified by attributes (adjectival
phrases), the verb states the relation between the nouns, i.e. actions, events
or states. The order of these basic elements is stated by means of syntax. By
using a dictionary and a system of rules we hope to be able to construct
theories and models that can be used to describe events.
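
The AaO unit described above can be pictured as a small record with three slots. The following is only an illustrative sketch: the class name `AaOUnit`, its field names and the example words are assumptions made here and are not part of the ANACONDA coding manual.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AaOUnit:
    """One Agent-action-Object unit; any slot may be empty in real text."""
    agent: Optional[str] = None    # noun1, the centre of action
    action: Optional[str] = None   # verb linking the two nouns
    obj: Optional[str] = None      # noun2, what the action is directed at
    agent_attributes: List[str] = field(default_factory=list)
    object_attributes: List[str] = field(default_factory=list)

    def pattern(self) -> str:
        """Return which slots are filled, e.g. 'AaO', 'Aa' or 'aO'."""
        return (("A" if self.agent else "")
                + ("a" if self.action else "")
                + ("O" if self.obj else ""))


# Hypothetical example sentence: "The researcher searches relevant literature."
unit = AaOUnit(agent="researcher", action="searches", obj="literature",
               object_attributes=["relevant"])
print(unit.pattern())  # AaO
```

Keeping the slots optional makes it possible to record incomplete units as well, which matters because spoken text does not always supply all three components.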

By means of the linguistic elements to be found in the flow chart in Figure 1,
we shall show both how we build up concepts in a given context and the way in
which we intend to describe the text numerically and make quantitative
analyses. The model in Figure 1 contains three different geometric forms. They
have the following meaning:

1. Rectangles, which symbolize main concepts. These have a code number ending
   with 0.

2. Rectangles with dotted lines, which symbolize qualifiers or specifiers of
   various kinds. These have a final digit other than 0.

3. A rhombus, which states choice and decision. Very briefly, this means a
   selection of moods of expression.

The way in which we make use of these possibilities is described in B.
Bierschenk (1976).

[code illegible]  Source of information
12  Negation
13  Tense
14  Mood
15  Condition
16  Cause
17  Concession
18  Intention
19  Disjunction
20  Comparison
21  Interrogation
22  Supposition
23  Volition
30  Agent
31  Qualification
32  Description
33  Classification
40  Event/state
41  Copula
42  Clause modifier
43  Time
44  Place
45  Modifier of event/state
46  Circumstance
50  Object
51  Qualification
52  Description
53  Classification
60  Direction/localization
61  Qualification
62  Description
63  Classification
70  Recipient
71  Qualification
72  Description
73  Classification
80  Instrument
81  Qualification
82  Description
83  Classification

Figure 1. Flow-chart for an analysis of text and [...]
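
The numbering convention of Figure 1 (main concepts end in 0, qualifiers and specifiers end in another digit) can be checked mechanically. The sketch below is a minimal illustration; the function names are hypothetical, and the idea that a qualifier belongs to the main concept of its own decade (e.g. 52 to 50) is an assumption read off the layout of the codes 30 to 83, not a rule stated in the text.

```python
def is_main_concept(code: int) -> bool:
    """Main concepts have a code number ending in 0."""
    return code % 10 == 0


def main_concept_of(code: int) -> int:
    """Assumed rule: a code belongs to the main concept of its decade."""
    return code - code % 10


# A few codes taken from Figure 1.
CODES = {30: "Agent", 32: "Description", 40: "Event/state",
         41: "Copula", 50: "Object", 52: "Description"}

for code, label in sorted(CODES.items()):
    kind = "main concept" if is_main_concept(code) else "qualifier/specifier"
    print(code, label, kind, "->", CODES[main_concept_of(code)])
```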



NUMERIC DESCRIPTION AND QUANTITATIVE ANALYSES 



The building up of concepts in a given context presupposes a system of rules
that states how different elements are to be linked together and in which order
this should take place. We assume, for example, that the relations that exist
between nouns and adjectives and between nouns and verbs reflect such relations
as link empirical phenomena with each other. If, for example, we want to state
that a set of nouns is modified by a set of adjectives, we can formalize this
relationship.

Further we assume that nouns get their empirically specified content through 
adjectives and/or verbs to which they are linked. By scaling adjectives and verbs 
we can acquire numeric descriptions and quantitative analyses of text. When we 
have in this way observed similarities or co-variations and defined different pro- 
perties, we can carry out multivariate analyses in order to determine the posi- 
tion of a particular property in a number of latent dimensions. 

Now we will proceed to present individual analyses on the basis of the code 
structure stated in Figure 1. 



Codes 32, 52, 72, 82 and 40

Since in our analysis we take into consideration "syntactic behaviour" and
regard both adjective and verb as descriptive concepts, we have decided to
scale them. An adjective describes a noun directly, while the verb has the same
function more indirectly.

The scaling was carried out by means of seven-point assessment scales, the
bipolar end-points of which are described by pairs of adjectives: (1)
positive-negative, (2) active-passive and (3) strong-weak. A detailed account
of this is to be found in B. Bierschenk (1976) and in Bierschenk & Bierschenk
(1976).
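
As a rough illustration of what a scaled adjective or verb might look like as data, the sketch below stores one rating per bipolar scale and averages the ratings of several judges. The field names `evaluation`, `activity` and `potency` are labels borrowed here for the three adjective pairs, the direction of the scale (which pole receives 7) is an assumption, and the example ratings are invented; none of this is taken from the reports cited above.

```python
from dataclasses import dataclass

SCALE_MIN, SCALE_MAX = 1, 7  # seven-point assessment scales


@dataclass(frozen=True)
class Rating:
    evaluation: int  # positive-negative
    activity: int    # active-passive
    potency: int     # strong-weak

    def __post_init__(self):
        for value in (self.evaluation, self.activity, self.potency):
            if not SCALE_MIN <= value <= SCALE_MAX:
                raise ValueError("rating outside the seven-point scale")


def mean_rating(ratings):
    """Average several judges' ratings of one word, scale by scale."""
    n = len(ratings)
    return (sum(r.evaluation for r in ratings) / n,
            sum(r.activity for r in ratings) / n,
            sum(r.potency for r in ratings) / n)


# Two hypothetical judges rating the same verb.
print(mean_rating([Rating(7, 5, 3), Rating(5, 3, 1)]))  # (6.0, 4.0, 2.0)
```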
1. CONSTRUCTION OF THEORIES AND MODELS FOR RESEARCH PROCESSES

The initial phase of the research process, i.e. the formulation and definition
of problems, has been studied empirically. The development of ideas and the
formulation of problems are "behaviours" that are closely connected with a
human being's specific ability to express himself verbally. As with all other
kinds of raw data, the analytical problem in the use of spoken or written text
is that the researcher must infer specific events, behaviours or properties
that are associated with the "measuring object" of the investigation. Thus the
researchers' "messages" concerning the processes of problem perception and
problem formulation form the basis of this study.

Treatment of the researchers' verbal statements according to ANACONDA involves
an analysis and synthesis of both empirical statements and the relationships
between them. This means that we cannot be content with a traditional
lexicographic listing of words as a base for an approximation of the interview
person's implicit models for the research process, but must also take into
consideration syntactic order and context. The explicit coding of the basic
material that ANACONDA involves means that such material forms a foundation for
an iterative construction of theories and models. The flow chart in Figure 1
shows that by starting from elements containing linguistic information, we can
build up different analysis units. This presupposes a system of rules that
states how different elements are to be linked together (see Bierschenk &
Bierschenk, 1976, pp. 90-93). But at the same time it also means that we must
be able to handle large amounts of data so that meaningful statistical
descriptions and analyses become possible.

The first part of the analysis procedure encompasses steps 1 and 2 (see p. 15) 
and is carried out in order to study whether and to what extent the researchers 
make use of the same words when they formulate and define their problems. The 
frequency distribution shows that we should be able to test an application of clus- 
ter analysis models to condense agents that have been used by four or more re- 
searchers. 

The second part of the analysis procedure encompasses steps 3 and 4 and is 
carried out in order to study whether and to what extent the same objects are 
used. The frequency distribution shows that the objects that have been used by 
four or more researchers can also be condensed by means of some kind of clus- 
ter analysis model. 

In the third part of the analysis, which encompasses steps 5, 6 and 7, several
different cluster analysis models and different amalgamation criteria are
applied when the clusters are formed. But despite relatively good agreement
between the results of the different cluster analyses, it seems as if the
cluster analyses insufficiently represent the natural groups in the material.

It is in this phase of the analysis process that the researcher can make use of
ANACONDA to initiate an iterative construction of theories and models.
According to classical content analysis techniques, we should probably have
been content with the relations that had emerged and tried to explain them
according to the theory of simple associations, since the patterns that the
analysis results show appear to be intuitively meaningful. On the basis of what
we "know" about the empirical conditions, however, we cannot allow ourselves to
be satisfied with this result.

In the fourth part of the analysis, which encompasses steps 8, 9, 10, 11 and 
12, a study is made of the grouping of the agents and the objects when we only 
take into consideration how often a particular word occurs. The frequency dist- 
ribution shows that five or more occurrences of the same word is a suitable 
lower limit. 

In the fifth part of the analysis, which encompasses step 13, both the agents 
and the objects are clustered. These analyses lead to structures that are com- 
pletely different from those that emerged earlier, partly because more elements 
are included in the analysis and partly because other criteria have been used in 
the determination of the elements. At the same time the clusters give the im- 
pression of better representing the structure of the agents and objects. But this 
analysis is again nothing other than a simple expression of similarities (based 
on the association theory) which are assumed to exist between the units. 

Not until the sixth part of the analysis, encompassing step 14, do we exploit
the logical relations that exist between the agents and the objects within the
framework of the AaO paradigm. By using the verb to give the clause its
empirically specified content, we make use of the AaO paradigm in order to
reflect the relations that link different empirical phenomena.

In the seventh part of the analysis, encompassing step 15, we make use of the
quantitative qualification of the verbs (see Bierschenk & Bierschenk, 1976,
pp. 73-89) to give the objects an empirically specified content. By linking the
objects to their respective verbs, the objects are functionalized, i.e. they
are transformed into empirically specified concepts (see Bierschenk &
Bierschenk, 1976, pp. 35-37).

Finally we should perhaps explicitly point out here that we have not made use
of all the nuances and variations permitted by the model, but have restricted
ourselves to a fundamental structure. It is on the basis of this fundamental
structure that we study the dimensionality of the empirically specified
concepts by means of a multivariate analysis model.

2. NUMERIC DESCRIPTION AND MULTIVARIATE ANALYSIS OF INTERVIEW TEXTS



The interview method was chosen as the investigatory strategy on the basis of
the assumption that the researcher's opportunity to give free and uncommitted
answers would provide information of a high validity, at least as long as we
can assume that the person interviewed is willing to take part in the
investigation. This choice is also motivated by the fact that this method
should be more sensitive than a questionnaire with fixed alternative answers,
since the interviewees can define their own statements in as differentiated a
way as they wish.

In the ANACONDA model different linguistic elements form the building 
blocks for an empirically specified concept. The dependencies that exist between 
nouns and adjectives and the relations that exist between nouns and verbs are 
assumed to reflect the relations that link empirical phenomena. 

The first measure taken in building up a concept with an empirical anchorage 
has been to scale all adjectives and verbs by means of semantic differentials. The 
present report is a direct continuation of this work. Our purpose is to study the 
dimensionality of interview texts. In this report we shall describe how we have 

1. built up different registers containing linguistic elements 

2. used cluster analysis techniques to describe manifest relation patterns 
based on different registers and 

3. used a discriminant analysis technique to describe latent relation patterns. 

The analysis results that will be presented are based on the verbal statements
of forty researchers, chosen at random from the researcher population (see
B. Bierschenk, 1974, pp. 32-44). The entire text material consists of
approximately 4 000 pages of text, but the analysis results are based on only
10% of this, i.e. about 400 pages of text. These 400 pages refer to the answers
that the researchers gave to four questions (nos. 5, 6, 7 and 8), all of which
concern information and documentation problems. These questions have the
following wording:

5. In which way have you tried to gain more detailed complementary knowledge?

6. How consistently have you during the formulation process made use of
   channels of information such as libraries etc?

7. What type of information have you searched for and which search strategy
   have you used most?

8. Could you say anything about how one should design information searching
   in order to create ideal conditions for the research process? Have you any
   suggestions for improvements?

The compilation of lists containing words used by the interview persons when
answering questions 5-8 guarantees that our analysis becomes very closely
related to the researchers' own language.



2.1 Condensation of registers

Registers can be defined as a formalized arrangement or description of elements
(words or speech). If a computer and a suitable "Key-Words-In-Context" (KWIC)
programme are used, we can build up registers that are very closely related to
the text concerned in the analysis. But this is not a sufficient foundation for
content-analytical processing. We must also be able to use such relations as
exist within and between clauses and we must be able to construct categories. A
category is usually defined by a number of common attributes. By definition, a
set of properties is decided upon that are both necessary and sufficient to
make it possible to allocate a linguistic element or element complex to a
particular category.
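
To make the idea of a Key-Words-In-Context register concrete, here is a minimal sketch of a KWIC listing. It is not the programme used in the study; the function name `kwic`, the `width` parameter and the example sentence are all assumptions made for illustration.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Invented example text, split on whitespace only.
text = "the researcher searches the library and the library supplies reports"
for left, key, right in kwic(text.split(), "library", width=2):
    print(f"{left:>15} | {key} | {right}")
```

A listing of this kind shows every occurrence of a word together with its immediate neighbours, but it carries no information about relations within or between clauses, which is why it is not a sufficient foundation on its own.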

If a system of categories is defined on the basis of empirical considerations,
it becomes necessary to determine closeness, kinship or similarity (this can
also be expressed as distance) for a set of elements.

The interview texts for questions 5-8 have been coded in their entirety in 
accordance with the AaO paradigm. The scaling of the researchers' actions as 
registered in verb codes has already been described and in this report the rela- 
tion patterns between "agents" and between "objects" will be studied. The term 
agent is used here with the implication of centre of action while the term object 
stands for means or goal that is the object of an action. 

The pre-studies of the appearance of the interview texts have shown that we
would get very sparse ("hollow") data matrices. By using different cluster
analysis models, we hope to be able to condense both agents and objects, so
that the matrices become more complete, i.e. have a marking for occurrence in
each individual cell.

If cluster analysis techniques are used, it becomes possible to determine agent
and object clusters objectively. If the agent clusters are then defined as the
measuring objects of the analysis and the concept clusters (determined
quantitatively by means of verb links) as the variables of the analysis,
statistical analyses become possible. In addition we can draw definable and
exactly comparable boundaries between all the clusters (categories). This
comparison is, however, limited to a single investigation. If a criterion value
has been established, the manifest structure within a cluster arrangement
becomes completely dependent on the value of the coefficients and is thus no
longer exposed to the manipulation of the researcher (see Sokal & Sneath, 1963,
pp. 49-53).

Each fusion criterion presupposes certain given mathematical assumptions
and these vary in different cluster analysis models. Since on the one hand we
apply several different models and criteria, and on the other have information
from analyses carried out with other analysis models, we can decide whether
the analysis result is meaningful or if it is an artificial structure. Such
information is available in Annerblom (1974) and in B. Bierschenk (1974).
Cluster analysis techniques can also be used to condense results from cluster
analyses.

Before describing the approach used in applying different cluster analysis 
models for condensing the information contained in the interview texts, we shall 
give a numeric description of the appearance of the interview texts. The coding 
of the 400 pages of text that refer to questions 5-8 has resulted in 32 445 units 
(IBM cards). By defining the AaO paradigm as distinct units, we have been able 
to build up 

1. a noun register that includes agents (A) and objects (O). The register con- 
tains 1 634 elements 

2. a verb register that contains 1 607 elements and 

3. an adjective register that contains 586 elements. 

In addition we have constructed 

4. a register for noun endings that contains 77 elements 

5. a register for verb endings that contains 37 elements and 

6. a register for adjective endings that contains 41 elements. 

We started a more detailed description of the appearance of the material by
studying how the interviewees 1-4 have answered questions 5-8 in the interview.
The results of this study are given in Table 1.

Table 1. Numerical description of interview text
         from interview persons 1-4.

Paradigm    Number of cases         %
AaO                     864    100.00
aO                       86      9.95
Aa                      284     34.08
A O                     135     16.20
A*                       87     10.44
O*                      190     22.80

* Registration at first occurrence only.



As can be seen from Table 1, about 60% of the AaO paradigms are incomplete
in one way or another. Thus the analysis should begin with a systematic
description of each step in the AaO paradigm. If the agents are defined as the
measuring objects of the analysis and the objects as the variables of the
analysis, we construct a data matrix with 16 530 cells. An empirical control of
the number of cells showed that only 342 or 2.27% of the cells contain one or
more markings. With this result as a starting point, it became obvious that the
material must be condensed and considerably homogenized. For this purpose the
following analysis programme was carried out.

Step 1. An analysis of how many agents the 40 interview persons have produ- 
ced in four questions. 

Step 2. An analysis of how the agents are distributed over the interview per- 
sons. 

Step 3. An analysis of how many objects the 40 interview persons have produ- 
ced in four questions. 

Step 4. An analysis of how the objects are distributed over the interview per- 
sons. 

Step 5. A cluster analysis of the agents: (1) BMD P01M, in which the agents 
are treated as variables and (2) BMD P02M, in which the agents are 
treated as measuring objects. 

Step 6. A cluster analysis of the objects: (1) BMD P01M, in which the objects
are treated as variables and (2) BMD P02M, in which the agents are
treated as measuring objects.

Step 7. A cluster analysis of blocks: (1) BMD P03M, in which agents and ob-
jects respectively form one block and the interviewees another and
where the threshold value (step length) is placed at .20 and (2) the
threshold value at .10 respectively.

Step 8. An analysis of the agents' coincidences with objects, where the coin- 
cidences are determined through verbs. 

Step 9. An analysis of the number of agents that occur at least five times in 
the interview material. 

Step 10. An analysis of the number of objects that occur at least five times in 
the interview material. 

Step 11. An analysis of the distribution of the number of agents that coincide 
with such objects as occur at least five times. 

Step 12. An analysis of distribution of the number of objects that coincide with 
such agents as occur at least five times. 

Step 13. A cluster analysis of agents and objects respectively: (1) BMD P01M,
in which agents and objects respectively are treated as variables.

Step 14. An analysis of the agent clusters' coincidences with the object clusters, 
where the coincidences are determined by verbs. 

Step 15. A discriminant analysis of reduced coincidence matrices. 

Since statistical analyses and consequently also cluster analyses assume
complete data matrices, the AaO paradigm was studied more closely. Of 3 458
subject-object combinations, only 166 or 4.8% are used by four or more
interview persons. Four researchers or 10% was decided upon as a lower limit
for a numeric description, since the intention is to be able eventually to
carry out different studies. One way of preparing the interview material so
that cluster analyses can be carried out is to standardize the material, so
that only the first occurrence for a particular interview person is marked,
i.e. [...]. All entries in a matrix become in this way of equal importance. If
two matrices are to be compared, the entries can then easily be rescaled, e.g.
to the mean variance 1.
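
The standardization to first occurrences can be sketched as turning a table of word frequencies into a table of 0/1 marks. This is a minimal illustration under the assumption that rows are interview persons and columns are words; the function name and the toy counts are invented.

```python
def first_occurrence_matrix(counts):
    """Replace each frequency by 1 if the word occurs at all, else 0."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]


# Two hypothetical interview persons (rows) and three words (columns).
counts = [[3, 0, 1],
          [0, 0, 2]]
incidence = first_occurrence_matrix(counts)
print(incidence)  # [[1, 0, 1], [0, 0, 1]]
```

After this step a word used ten times by one person weighs no more than a word used once, which is what makes all entries of equal importance.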

An examination of the frequency of use of the agents shows that the analysis
material contains 953 agents. Of these, however, only 90 or 9.4% are used by
four or more researchers. The examination of the occurrence of the objects
shows that the analysis material contains 1 122 different objects. Only 126 or
11.2% are used by four or more interviewees, however. A data matrix of the size
90 x 126 would lead to 11 340 cells, most of which would probably be empty, and
a statistical analysis that would be difficult to carry out. For this reason
the cluster analyses that will now be described have been made.
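
The selection of elements used by four or more researchers, and the arithmetic quoted above, can be sketched as follows. The helper names and the toy matrix are assumptions made for illustration; only the figures 90, 953, 126 and 1 122 come from the text.

```python
def users_per_element(incidence):
    """Column sums of a persons x elements matrix of 0/1 marks."""
    return [sum(column) for column in zip(*incidence)]


def keep_shared(incidence, min_users=4):
    """Indices of the elements used by at least `min_users` persons."""
    counts = users_per_element(incidence)
    return [j for j, n in enumerate(counts) if n >= min_users]


# Toy matrix: five persons (rows), three elements (columns).
toy = [[1, 0, 1],
       [1, 0, 0],
       [1, 1, 0],
       [1, 0, 0],
       [0, 0, 1]]
print(keep_shared(toy))            # [0]

# Figures quoted in the text.
print(round(100 * 90 / 953, 1))    # 9.4
print(round(100 * 126 / 1122, 1))  # 11.2
print(90 * 126)                    # 11340
```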

2.2 Description of manifest relation patterns

When different cluster models are used to describe relation patterns, the
researcher's aim is to homogenize and condense large sets of data. In our case
this results in a hierarchical classification. Further classifications can be
created by using the same technique with different criteria, or by using a
different but comparable technique. The clustering results can then be
compared. The assessment of whether or not the results obtained are meaningful
can only be subjective, however.

2.2.1 Clustering of agents

If the interview persons are treated as measuring objects and the agents as
variables, we can construct similarity matrices (or distance matrices) and
define the distance between two rows in the matrix. The row values are
calculated for all the variables. It is assumed that none of the row scores is
missing. Most techniques are based on the use of similarity measurements. The
values of the similarity coefficients can vary between 1.00 (perfect agreement)
and [...] (no agreement at all). One of the cluster analysis programmes that we
have used is BMD P01M; the programme is described in Biomedical Computer
Programs (Dixon, 1975). In this programme the variables are grouped according
to similarity. The measurement used for the association between pairs of
variables is "Euclidean distance". This distance is defined as "the square root
of the sum of the squared differences between the values for pairs of
variables". The summation takes place over all 40 interview persons. In this
way a cluster is formed that contains the variables that are most like each
other. An amalgamation algorithm then determines which two clusters are most
alike. In this analysis we have made use of "the average linkage algorithm".
According to this rule, the distance between two clusters is the mean of the
distances between a variable in the first cluster and a variable in the second
cluster, taken over all such pairs.
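
The two definitions quoted above, Euclidean distance between variables and average linkage between clusters, can be written out in a few lines. This is a plain sketch of the standard definitions and does not reproduce the internals of BMD P01M; the function names and the three toy agents, each marked over only four interview persons instead of forty, are assumptions.

```python
from math import sqrt


def euclidean(u, v):
    """Square root of the sum of squared differences between two variables."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(cluster_a, cluster_b):
    """Mean distance over all pairs with one variable from each cluster."""
    pairs = [(u, v) for u in cluster_a for v in cluster_b]
    return sum(euclidean(u, v) for u, v in pairs) / len(pairs)


# Each variable holds the 0/1 marks of one agent over four interview persons.
agent_1 = [1, 0, 1, 1]
agent_2 = [1, 0, 0, 1]
agent_3 = [0, 1, 0, 0]

print(euclidean(agent_1, agent_2))                     # 1.0
print(average_linkage([agent_1, agent_2], [agent_3]))  # mean of 2.0 and sqrt(3)
```

With 0/1 marks the squared Euclidean distance is simply the number of interview persons on which two agents disagree, so agents used by the same persons end up close together.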

The similarity matrices can then be used in several different ways. The 
cluster analysis technique that is based on Sokal and Sneath's method 'BMD 
P01M) transforms the similarity coefficients to product-moment correlations. 
The relations that exist between the similarity values and product-moment cor- 
relations are given in Table 2. 



Table 2. Transformation of similarity values
         to product-moment correlations.

Similarity values     Product-moment correlations

      50.00                      .00
      55.00                      .10
      60.00                      .20
      65.00                      .30
      70.00                      .40
      75.00                      .50
      80.00                      .60
      85.00                      .70
      90.00                      .80
      95.00                      .90
     100.00                     1.00
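The eleven rows of Table 2 all lie on one straight line, so the transformation can be written as a single linear rule. The sketch below checks that rule against the tabulated values; the function name is hypothetical, and nothing is claimed here about values the table does not list.

```python
def similarity_to_correlation(similarity):
    """Linear rescaling that reproduces every row of Table 2."""
    return (similarity - 50.0) / 50.0


# The rows of Table 2: similarity value -> product-moment correlation.
table_2 = {
    50.0: 0.00, 55.0: 0.10, 60.0: 0.20, 65.0: 0.30, 70.0: 0.40, 75.0: 0.50,
    80.0: 0.60, 85.0: 0.70, 90.0: 0.80, 95.0: 0.90, 100.0: 1.00,
}

max_error = max(
    abs(similarity_to_correlation(s) - r) for s, r in table_2.items()
)
print("largest deviation from Table 2:", max_error)
```

Under this rule a correlation of .30 corresponds to a similarity value of 65.00.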



The correlations shown in Table 2 provide a gauge of how well the hierarchical
classification fits the original similarity matrix (see Anderberg, 1973, pp.
203-204). At the beginning of the analysis each individual agent is regarded as
a "cluster". If two clusters fulfil a closeness or distance criterion, they are
placed together to form one. The process then proceeds in this way through
the whole material. The lowest limit used to merge variables in cluster
formation is r > .30. The result of the agent clustering is summarized in
B. Bierschenk, 1976, Appendix 1:1. With the knowledge we have of the interview
texts, the clusters appear to be meaningful from the point of view of
interpretation. The fact that relatively many "one-variable clusters" exist
indicates heterogeneity in the agents. This can possibly be because we have not
in this analysis differentiated between auxiliary verbs (code 41) and verbs
(code 40). The construction with an auxiliary verb has a copulative function,
i.e. the verb (primarily "is") binds the agent code with a description or
classification, where no action has been described. In this sense it is a
question of a qualification. Since this clustering result is only a kind of
pre-study, the results of the agent clustering will not be reproduced in their
entirety, but are available at the Department of Educational and Psychological
Research in Malmö. In this phase of the analysis process it is also too early
to give the clusters general headings.

2.2.2 Clustering of objects

The clustering of objects, in which objects are treated as variables, is based
on 129 objects. The result of the object clustering is summarized in
B. Bierschenk, 1976, Appendix 1:2. In the evaluation of the clustering results
the same amalgamation criterion has been used as in the evaluation of the agent
clustering. The object clustering shows a considerable reduction in the number
of variables, namely from 129 to 48. The cluster structure appears to be
meaningful. A sign of the greater homogeneity in the object clusters is that
there are more clusters containing three or more variables than is the case in
the agent clusters. In addition there are fewer "one-variable clusters". This
greater homogeneity can possibly be explained by the selective function of the
verbs. It is still too early, however, to give the clusters general headings,
so no attempt has been made to find any.

2.2.3 Clustering of blocks

Hartigan's (1972, pp. 123-129) suggestion for a "direct clustering of a data mat- 
rix" implies a cluster analysis model that differs in several respects from the 
BMD P01M and P02M programmes. The analysis model (BMD P03M) is more 
complex, since both measuring objects and variables are clustered simulta- 
neously. Hartigan (1972, p. 123) writes: 

"The principal advantage in this approach is the direct interpretation of the 
clusters on the data. " 

Another essential difference between the block clustering technique and the one
already described is that the block clustering technique compares the
similarity matrices by means of a calculation of distance instead of
correlations. This distance is represented as a weighted Euclidean distance.
The cluster analysis method is based on an attempt to minimize the distance
between the matrices, in which the one matrix is the original similarity matrix
and the other is a similarity matrix for clusters. Thus what happens is that
the differences between the group mean values are tested and the clusters that
on a particular level prove to be most like each other with regard to their
respective mean vectors or centroids are merged. One distinguishing factor in
this method is that the similarity values for the linkages in the most similar
clusters can vary (rise and fall) from step to step. Thus it can happen, for
example, that the distance (when the similarity measurement is an expression of
distance) between the centroids of certain pairs is less than between another
pair that has been combined at an earlier stage. As a result the last linkages
take place on a lower level than in the two preceding cases (see
Anderberg, 1973, p. 141).
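The rise and fall of the linkage level can be shown with three invented points. The sketch below uses plain (unweighted) centroid distances, not the weighted distance of BMD P03M, and the coordinates are assumptions chosen only to produce the effect.

```python
import math


def centroid(points):
    """Mean vector of a group of points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Three hypothetical measuring objects forming a nearly equilateral triangle.
a, b, c = (0.0, 0.0), (2.0, 0.0), (1.0, 1.8)

# Step 1: a and b are the closest pair, so they are merged first.
first_merge = dist(a, b)
assert first_merge < dist(a, c) and first_merge < dist(b, c)

# Step 2: c is joined to the centroid of {a, b}, at a SHORTER distance.
second_merge = dist(centroid([a, b]), c)

print("first linkage at", first_merge)
print("second linkage at", round(second_merge, 3))
```

The second linkage takes place at a lower distance than the first, which is the kind of reversal described above.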

The block cluster analysis was carried out with agents as variables and
interviewees as measuring objects. Two different threshold values were used,
namely step length .10 and step length .20. But for the sake of completeness
all the other step lengths are also given. Together with the two described
cluster analyses, this study forms a pilot-study and for this reason we shall
not give a more detailed account of the cluster structure but only give a
summarized account of the results in B. Bierschenk, 1976, Appendix 1:3. As can
be seen in Appendix 1:3, the clustering structures are rather like each other.
In addition it emerges clearly which clusters the larger clusters break down
into when the limit value is selected more restrictively. In this way the
interpretation is also made easier. The most marked difference between the two
clustering results given is that the threshold value .20 leads to fewer
clusters than is the case when the threshold value .10 is used as a criterion
of division. In the first analysis 38 clusters have been formed while the
number of clusters in the second analysis is 24. Moreover, in both analyses it
appears as if the more abstract agents are clustered and linked together at a
relatively late stage in the analysis procedure.

In the next chapter we shall attempt to summarize the results of the agent 
clustering by comparing the results of the different analysis techniques. 

2.2.4 A comparison of the results of two cluster analysis methods

In the introduction it was mentioned that different mathematical assumptions
and different criteria of division can have as a result that the use of
different cluster analysis techniques leads to different cluster structures.
Our purpose in this comparison is to study to what extent there is a kernel of
clusters or "natural groups" or at least individual "agents" that remain stable
irrespective of which analysis model is used. Thus the purpose is to sort the
agents into groups so that the degree of natural associations is high between
the agents that have been placed in the same group and low between the members
of different groups. For the purpose of comparison, B. Bierschenk, 1976,
Appendix 1:4 - 1:5 gives a reorganization of the cluster results from
Appendices 1:1, 1:2 and 1:3.

A comparison between the agent clusters resulting from Sokal and Sneath's
and Hartigan's methods respectively shows that there are many clusters that
are very like each other. But the analysis result also shows that the "I"
reference forms a cluster of its own with comparatively high frequencies. This
gave rise to a division of the I-reference into as many distinct elements as
there were interviewees. In addition the analysis results suggest that we
should not standardize the agents with regard to the interviewees but study the
occurrence of the agents in the text irrespective of how many interviewees have
made use of the same agent. In this way it becomes easier for "natural" agent
clusters appearing in the text to be formed.



2.2.5 Clustering of agents & objects 

The cluster analysis results presented hitherto are all based on a clustering
of agents and objects respectively without taking into consideration the
agents' coincidences with the objects and the reverse. Which agents coincide
with which objects, and how often, is determined by verb linkages. But our
analysis technique is based on the very assumption that the whole AaO paradigm
is of importance in an analysis and synthesis of empirical phenomena. In a
later stage of the analysis programme (steps 8-13) we have studied the
occurrence and distribution of the agents that coincide with the objects when
the objects are related to the agents via verbs and the reverse. This analysis
shows that 888 agents coincide with the objects, while 960 objects coincide
with the agents. An analysis of the agents' and the objects' frequency
distribution shows that a limit can be drawn at frequency 5, i.e. the agents
and objects that occur at least 5 times in the entire material (irrespective of
the interviewee) should be included in the continued analysis. A new analysis
of the distribution of agents and objects with (N ≥ 5) shows that there are 222
agents and 192 objects. In order to be able to construct a similarity matrix
from a data matrix of the order 222 x 192, it became necessary to expand
existing computer programmes. Then a cluster analysis was carried out by means
of Sokal and Sneath's cluster model both for the agents as variables and for
the objects as variables. The cluster structures are described in
B. Bierschenk, 1976, Appendices 2 and 3. A summary of the agent cluster
structure is given there in Appendix 2:5 - 2:6 and the cluster structure of the
objects is summarized in the same publication, Appendix 3:5 - 3:6. As in the
earlier analyses, correlations > .30 have been used as the criterion for
amalgamation. A comparison between the clusters described in these Appendices,
Boxes 1 and 2, and the clusters that emerge from Boxes 5 and 6 shows that the
cluster structures have changed completely, [...] criterion values, we can use
the cluster analyses to discover quite different structures in our data. The
analysis results make it quite plain that if the analysis procedure is based on
the separate parts of the AaO paradigm, we get quite different results than is
the case if we take into consideration the relations within the AaO paradigm.

The cluster structure of the agents shows that 222 agents can be condensed
considerably, i.e. by more than 50%. This analysis has led to 43 clusters with
2 or more agents and 57 clusters with only 1 agent. A noticeable difference
between these agent clusters and those described earlier is that the references
to "I" or to a particular project dominate. The greater element of agents also
encompasses concepts such as project, institution, library etc. These appear to
function as comprehensive terms for persons who can act. Another typical
feature in the agent clusters is all the terms for fields of work or functions
of various people, e.g. fellow-workers, Ph.D. students, project leaders or
archivists. No interpretation will be made now, however.

The cluster structure of the objects shows that 192 objects can be condensed
by more than 50%. The object structure contains 53 clusters with 2 or more
objects and 31 clusters with only 1 object. The situation is almost the reverse
of the agent structure. This indicates that there are greater similarities
between the objects. The summary of the agent structure appears to express an
action or contact for the purpose of obtaining information (names of libraries
and reference organs are to be found, as are reference groups). The summary of
the object structure expresses more the actual research work and means of
approach, i.e. concepts for the problem area concerned and the abstract words
such as problem, programme, report or thesis. From the point of view of
interpretation, the clusters appear meaningful, but it is still too early to
try to give each individual group a comprehensive heading.

The next step (step 14) in the analysis procedure is an examination of the
coincidences of the agent and object clusters. This matrix consists of 64 agent
clusters and 53 object clusters. 64 agent clusters were chosen since, in
addition to 43 clusters with 2 or more agents, there are a further 21 clusters
with an agent that has a frequency as great as or greater than the lowest
frequency in a cluster with 2 or more agents. The resultant coincidence matrix
was examined with respect to the number of empty cells. Out of 3 392 cells,
there are only 646 or 16% that contain one or more markings. A matrix with such
an appearance is hardly suitable for e.g. correlative studies. But since this
analysis technique can also be used to condense results from different cluster
analyses it is naturally also possible to compress the material further. The
result of an iterative assimilation by means of the applied cluster analysis
techniques is being worked on at present. But in this account we shall for the
moment apply a simpler approach, i.e. all agents and objects with coincidences
below 10% are ignored. The result of this selection was a matrix with 16 agent
clusters and 9 object clusters. Since this matrix still contains 34% of the
cells with no marking, the matrix has been reduced to a matrix with a size of
14 x 6. By this reduction the number of cells with no observation is reduced to
19%. This matrix provides the necessary condition (that there should not be too
many mean value assessments) for us to be able to exploit the dependence that
exists within the AaO paradigm in order to give the object an empirically
specified meaning. By linking the verb assessments that exist to the objects,
each object cluster can get its empirical content. In this way they are
transformed to "concept" clusters. If the agent clusters are now considered as
measuring objects and the concept clusters as variables, we can regard the
measuring objects as three independent groups (see Bierschenk & Bierschenk,
1976, pp. 79-89). This type of covariation schedule is suited to a discriminant
analysis. But before this analysis is presented and discussed in more detail,
we shall present the agent clusters and try to give them a comprehensive
denotation. The agent clusters are presented in Box 1 and the object clusters
in Box 2.
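The reduction of a sparse coincidence matrix can be sketched as follows. The reading of "coincidences below 10%" as "share of non-empty cells in a row or column below a limit" is an assumption, the function names are hypothetical, and the small matrix is invented; it is not the 64 x 53 matrix of the study.

```python
def empty_share(matrix):
    """Proportion of cells that contain no marking (value 0)."""
    cells = [value for row in matrix for value in row]
    return sum(1 for value in cells if value == 0) / len(cells)


def drop_sparse(matrix, min_share=0.10):
    """Drop rows, then columns, whose share of non-empty cells is below min_share.

    Assumed reading of the selection rule; rows stand for agent clusters and
    columns for object clusters.
    """
    rows = [r for r in matrix if sum(1 for v in r if v) / len(r) >= min_share]
    if not rows:
        return []
    keep = [
        j for j in range(len(rows[0]))
        if sum(1 for r in rows if r[j]) / len(rows) >= min_share
    ]
    return [[r[j] for j in keep] for r in rows]


# Hypothetical coincidence counts (4 agent clusters x 4 object clusters).
coincidences = [
    [3, 0, 1, 0],
    [2, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 4, 2, 0],
]

reduced = drop_sparse(coincidences, min_share=0.25)
print("empty before:", empty_share(coincidences))
print("empty after: ", round(empty_share(reduced), 3))
```

Removing the empty row and the empty column lowers the share of cells without a marking, which is the effect the reduction to a smaller matrix is meant to achieve.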

Box 1. Selected agent clusters

Agents                  Description

Cluster 1: Social-psychologically oriented researchers

ARBASS project          Occupational and social adjustment in mentally
                        retarded adolescents.
Fellow-workers
STUG project            Studies of generation conflicts
One (33)                Development of independence
I (25)                  Individualised teaching in geography
I (35)                  Studies of the development of children's
                        personalities in varying residential environments.

Cluster 5: Pupil-oriented researchers

The FRIS project        Free writing in the middle level of the basic
                        school
I (37)                  Preschool - primary school in cooperation
I (17)                  Interrupted studies in the basic school
I (28)                  The consumer project

Cluster 8: Language-oriented researchers

Library
I (36)                  Models for bilingual instruction of immigrant
                        children
I (29)                  Assessment of essays

Cluster 9: Science-oriented researchers

I (04)                  Problem-solution, mathematics teaching,
                        educational planning
I (14)                  Direction of studies in post-secondary school
                        (physics)

Cluster 11: Cognition-psychology-oriented researchers

I (01)                  Cognitive development (Piaget)
I (18)                  Self-instruction methods in teaching the deaf
I (09)                  Educational Achievement
I (16)                  Goals of adult education organizations - now and
                        in the future

Cluster 12: Researchers interested in methodological problems

I (27)                  Pedagogics in teacher training: problems of
                        content analysis
I (24)                  Methodological problems in educational research
I (08)                  Statistical methodological problems in
                        educational research

Cluster 13: Researchers interested in programmes for applying influence

The PUSA project        Personality development in backward pupils
The OM project          Overall goals
I (05)                  Social-psychological aspects of the compulsory
                        in-service training for teachers: programme
                        development
I (32)                  Teaching methods in higher education
I (31)                  Four-year-olds and their parents (parent education)
I (33)                  Social development and social training in the
                        basic school

Cluster 14: Linguistically-oriented researchers

Bierschenk              Psycho-linguistics
One (29)                Analysis of linguistic structures
I (30)                  Teaching methods in German

Cluster 18: Humanistically-oriented researchers

Researchers
I (20)                  Freedom and equality as basic educational
                        concepts within Western pedagogics
I (06)                  Process analysis of non-grading

Cluster 20: Researchers interested in socialization

I (03)                  Studies in the socialization of the school
I (13)                  Studies on the internal work of the basic school

Cluster 25: Dissemination of information

Literature
Symposium

Cluster 34: Type of information and transference of information

Book                    Reading Books
Relation                Method
Norm                    Suggestion
Source                  Computer
Convention              Reference
Measuring instrument    Content
Idea                    Works
Problem formulation     PA

Cluster 44:

We                      researchers, within the project, etc.
                        identification with certain groups

Cluster 46:

Person                  unspecified

Three different types of cluster are presented in Box 1, namely such (1) that
mainly contain agents referring to project names and persons, (2) that
exclusively contain agents referring to the dissemination of information, type
of information and transference of information and (3) that are relatively
unspecified. As can be seen from Box 1, the first type of cluster has been
described by means of the problem areas that the researcher in question has
mentioned as his point of reference for the interview. The other two types of
cluster need no description over and above what can be read from the clusters
themselves.

Box 2. Selected objects 



Cluster 13: Bibliographical reference 

Literature 
Reference 
Journal 

Cluster 14: Research organization 



Institute 

Project 

Seminar 

Cluster 29: Discussion of problems 



Discussion
Problem



Cluster 46: Channels of information 

Library 

Symposium 

Person 

Psychological Abstracts 

Reference group 

ERIC 

Department library 

University library 

Handbooks 

Reviewing organs 

Cluster 47: Information on research methods 

Design 

Summary 

Measuring instrument 

Cluster 48: Information for demarcation of concepts 

Document 

Suggestion 

Idea 



The distinguishing feature of the clusters presented in Box 2 is that content-wise 
they do not need to be clarified by any description. If these clusters are
compared to those discussed in Chapters 2.2.1 - 2.2.4, it emerges clearly that it is
only by making use of whole sentences that we can overcome the circumstance 
that such structures as exist in our texts are broken down in an artificial way. 

2.3 Description of latent patterns of relations

Following Chapter 2.2.5 we have arrived at a covariation schedule that is suited
to an analysis of more complex relations. By means of a discriminant analysis 
we shall in this chapter study whether and to what extent the six concept clusters 
described in Box 2 can be used to separate the groups as far as possible. (For 
a description of the discriminant analysis model, see Cooley & Lohnes, 1971; 
Tatsuoka, 1971.) 

The use of multivariate techniques usually presupposes many more measu- 
ring objects and variables than is the case in this analysis. The more measuring 
objects that form the foundation for the adaptation of the model to a set of empi- 
rical data, the greater the certainty of the model's goodness of fit and subse- 
quently of the generalizability of the results. 

Another prerequisite that must be fulfilled is that the measuring objects
consist of a random sample. This condition cannot be compensated by increasing
the number of measuring objects included in the analysis. This latter condition
must be considered fulfilled since the interviewees consist of a random sample.
On the other hand 14 measuring objects, 6 variables and 3 groups form a very
small set of material which restricts our possibility of generalizing from the
results of the analysis.

The decision to use a multivariate analysis model is based on the following
considerations:

1. The measuring objects consist not of individual agents, but of groups of
   agents that are a result of a statistical process of condensation and
   homogenization.

As has been seen from the analysis procedure described, several hundred
observations form the basis for the formation of the clusters that are included
in the analysis. This provides a more certain empirical base than would have
been the case with individual agents as measuring objects. The same argument
applies to the concept clusters included in the analysis.

2. The assessments on which this is based show a high degree of reliability
   (α max = .86 - .97).

These values indicate reliable measurements. As a result of the small number
of measuring objects, however, there remains an element of uncertainty over
how well the model fits our data. But this argument is only important for
significance testing and generalization. If the discriminant analysis model is
used for purely descriptive purposes, the doubts that have been expressed
regarding the generalization aspects are of subordinate importance.

Discriminant analyses can be carried out both by using all the variables
simultaneously, and by making a step-wise analysis of the relative
discrimination ability of each separate variable. Both types of analysis have
been made. The computer programmes used are partly Cooley & Lohnes' (1971)
MANOVA & DISCRIM, partly the step-wise discriminant analysis programme from
SPSS. As far as the simultaneous use of the variables is concerned, both
programmes lead to identical results. A more detailed account of the results of
the discriminant analysis will be given below against the background of the
step-wise analysis, since the comparison between the two types of analysis
showed that there is one variable that diminishes the discriminating ability of
the discriminant functions.

When three groups exist, two discriminant functions can be formed. The
discriminating power of each function is stated by means of the eigenvalues
and canonical correlations that are associated with the respective functions.
The eigenvalue is a gauge of the total variance that exists in the
discriminating variables. Table 3 presents the discriminating power of the
functions.

Table 3. Discriminant functions, eigenvalue (λ), relative percentage (%) and
         canonical correlations (R)




As can be seen from Table 3, the first function is much more important for
the separation than the second. The relationship corresponds to 3:1. The
canonical correlation forms another association measurement. R states to
what extent a function is related to the "group variables".

The statistical tests that are incorporated into the SPSS programme are
Wilks' lambda (Λ) and χ². These state the success with which the six concept
clusters separate our 3 groups when the variables form a discriminant function.
The statistical tests mentioned are given in Table 4.

Table 4. Discriminant functions, Wilks' lambda (Λ), χ², df and level of
         significance (p)

Discriminant
function          Λ        χ²       df      p

1                .30     44.98      10     .00
2                .70     13.28       4     .01

Table 4 shows how both functions are significant and consequently of importance
for a discrimination of the groups. Λ stands in a reverse relationship (an inverse
measurement) to the power of discrimination. High values on Λ mean that there
is no discriminating information of importance left.
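The figures in Table 4 can be related to each other with a short sketch. It uses Bartlett's chi-square approximation for Wilks' lambda and the standard relation between lambda and the eigenvalues; whether SPSS used exactly these formulas is not stated in the text. Two inputs are assumptions: 42 measuring objects (3 groups of 14, as in Table 6) and 5 variables (the degrees of freedom 10 and 4 are consistent with five variables and three groups). The lambda values are the rounded ones from Table 4, so the results are only approximate.

```python
import math


def bartlett_chi_square(wilks_lambda, n_objects, n_variables, n_groups):
    """Bartlett's approximation: -(N - 1 - (p + g) / 2) * ln(lambda)."""
    factor = n_objects - 1 - (n_variables + n_groups) / 2.0
    return -factor * math.log(wilks_lambda)


def eigenvalues_from_lambdas(residual_lambdas):
    """residual_lambdas[k] is Wilks' lambda after k functions are removed,
    i.e. the product of 1 / (1 + eigenvalue) over the remaining functions."""
    padded = list(residual_lambdas) + [1.0]
    return [padded[k + 1] / padded[k] - 1.0 for k in range(len(residual_lambdas))]


def canonical_correlation(eigenvalue):
    """R = sqrt(eigenvalue / (1 + eigenvalue))."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


N_OBJECTS, N_VARIABLES, N_GROUPS = 42, 5, 3   # assumptions, see lead-in
lambdas = [0.30, 0.70]                        # rounded values from Table 4

chi_1 = bartlett_chi_square(lambdas[0], N_OBJECTS, N_VARIABLES, N_GROUPS)
chi_2 = bartlett_chi_square(lambdas[1], N_OBJECTS, N_VARIABLES, N_GROUPS)
df_1 = N_VARIABLES * (N_GROUPS - 1)
df_2 = (N_VARIABLES - 1) * (N_GROUPS - 2)

eig_1, eig_2 = eigenvalues_from_lambdas(lambdas)
print("chi-square:", round(chi_1, 2), round(chi_2, 2), "df:", df_1, df_2)
print("eigenvalue ratio:", round(eig_1 / eig_2, 2))
print("canonical R:", round(canonical_correlation(eig_1), 2),
      round(canonical_correlation(eig_2), 2))
```

With the rounded lambdas the sketch gives chi-square values close to those printed in Table 4 and an eigenvalue ratio of roughly 3:1.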

The two subsequent discriminant functions are presented together with 
their respective coefficients (standardized) in Table 5. 

Table 5. Discriminant functions (f) and standardized coefficients

Concept cluster                                  f1       f2

13. Bibliographic reference                     .50      .04
29. Discussion of problems                      .44      .82
46. Channels of information                     .53     -.71
47. Information on research methods            -.17     -.49
48. Information on demarcation of concepts      .32      .33

The coefficients given in Table 5 are weights and can be interpreted in the
same way as factor loadings or beta weights in a multiple regression analysis.
In this sense the coefficients state which clusters contribute most to
differentiation in the respective dimensions. The clusters that are important
for the first function are "Bibliographic reference" and "Channels of
information", the latter of which in addition shows a high negative weight in
the second function.

The ones that are important for the second function are "Discussion of 
problems" and with a negative sign "Information on research methods". 
"Information on demarcation of concepts", on the other hand, is equally im- 
portant for both functions. This result is in agreement with the results presented 
in B. Bierschenk (1974, p. 64). 

The concept cluster "Research organization" has no discriminating power, 
but rather a reducing effect when the cluster is combined with the others. If 
we ignore the signs in front of the coefficients, Table 5 shows that clusters 29, 
46 and 48 contribute substantially to both functions. The initial values are given 
in Appendix 1. The two functions described above have been formed in such a 
way as to make the separation between the groups as great as possible. We shall 
take a closer look at how far we have succeeded in separating the groups and 
the extent to which the classification of the individual agent clusters to the 
respective groups has been satisfactory. Table 6 gives a summary of the 
result of the classification. 



Table 6. Summary of classification results

         No. of
Group    measuring          Group 1    Group 2    Group 3
         objects

1        14           n       12          2          0
                      %      85.7       14.3         .0

2        14           n        2          9          3
                      %      14.3       64.3       21.4

3        14           n        0          2         12
                      %        .0       14.3       85.7

Group 1  Evaluation dimension
Group 2  Activity dimension
Group 3  Power dimension

In 78.57% of the cases it has been possible to classify the measuring objects
correctly. By classification is meant here the process of determining the
probable group a measuring object will belong to when only the measuring
object's values in the discriminating variables included in the analysis are
available. But it should perhaps be pointed out that probability assessments
of this type presuppose a large number of measuring objects if we are to be
able with any certainty to make pronouncements on the probability of an event
happening or not. The assessed probabilities on which this classification is
based must therefore be considered very uncertain. Bearing this reservation
in mind, we can see from Table 6 that the efficiency of the variables in
separating the measuring objects is best with regard to groups 1 and 3. The
relatively large proportion that is wrongly classified (35.7%) means that as
far as the activity dimension is concerned the variables discriminate badly.
Table 6 also shows that the wrong classifications in the evaluation and power
dimensions are equally large (14.3%).
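The percentages quoted above follow directly from the counts in Table 6, as the sketch below shows. The two cells whose count is missing in the table but whose percentage is .0 are entered as 0, which is an assumption; the function names are hypothetical.

```python
# Rows: actual group (1 evaluation, 2 activity, 3 power).
# Columns: group predicted by the discriminant functions.
classification = [
    [12, 2, 0],
    [2, 9, 3],
    [0, 2, 12],
]


def hit_rate(matrix):
    """Proportion of measuring objects on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


def error_rates(matrix):
    """Proportion wrongly classified within each actual group."""
    return [1 - matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


overall = hit_rate(classification)
errors = error_rates(classification)
print("correctly classified: %.2f%%" % (100 * overall))
for group, rate in enumerate(errors, start=1):
    print("group %d wrongly classified: %.1f%%" % (group, 100 * rate))
```

The overall hit rate is 33 out of 42 measuring objects, and the error rate of the activity group is 5 out of 14.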

Another way of studying the classification ability of the discriminant
functions is to study the groups to which the agent clusters belong. Since we
have a classification of the agent clusters that has been used for derivation
of the functions, and can compare it with the predicted group affiliation, the
success of the classification can be measured empirically. The gauge for this
measurement is the proportion of correctly classified agent clusters. In
Appendix 2:1-2:3 can be found the predicted group affiliation for each
individual measuring object and discriminant values. As can be seen from
Appendix 2:1, we should, if we take primary account of our concept clusters,
have predicted group affiliation 2, i.e. the activity dimension, for the agent
clusters "Linguistically-oriented researchers" and "Dissemination of
information".

The classification ability of the discriminant functions for group 3, i.e.
the power dimension, is presented in Appendix 2:2. This appendix shows that
group affiliation 2, i.e. the activity dimension, is predicted for the agent
clusters "Social-psychology oriented researchers" and "Cognition-psychology
oriented researchers". In this "group" there are only 2 incorrect
classifications, which means that our six concept clusters function well as
prediction variables. Another distinguishing feature in this table (Appendix
2:2) is that the discriminant values for the measuring objects with regard to
this dimension are relatively similar.

Finally Appendix 2:3 will show how well the affiliation of the measuring
objects to group 2, i.e. the activity dimension, could be predicted. As can
be seen from Appendix 2:3 it has been more difficult to predict, on the basis
of the information that is available from the concept clusters, the affiliation
of the agent clusters to group 2, i.e. the activity dimension. In 35.71% of
the cases the agent clusters were incorrectly classified. For the agent
clusters "Language-oriented researchers", "Type and transference of
information" and "We", affiliation to the power dimension is predicted.
Problems of classification have arisen regarding "Researchers interested in
method problems" and "Humanistic-oriented researchers" and whether they belong
to group 1 or group 2. This greater uncertainty is also reflected in the
greater variation in the discriminant values.

Further information on the differences between groups 1-3 can be
obtained from the groups' centroids (stated in Fig. 2 by *) and a graphic
presentation of the position of the measuring objects in a two-dimensional
discriminant space (group affiliation is stated by the figures 1, 2 or 3).

The centroids in Figure 2 state the mean value of the discriminant values
for each group and respective function. As can be seen from Figure 2,
Function 1 discriminates well between groups 1 and 3. This function is
described primarily by "Bibliographic reference" and "Channels of information".
Function 2, the positive pole of which is mainly characterized by "Problem
discussion" and the negative pole by "Channels of information" and "Information
on research methods", is needed to distinguish as far as possible group 2 from
groups 1 and 3. Figure 2 shows that it is easier to differentiate between
evaluation and power than between evaluation and activity, or power and
activity. But the figure also shows that both functions have a good separating
power. This would probably emerge even more clearly if it were not for
three so-called outliers. The agent cluster "Dissemination of information"
with the discriminant value (1.63, -1.61) in the evaluation dimension and
(-1.67, -2.60) in the activity dimension is the one that deviates markedly
from the other clusters. The other agent cluster that falls outside its group
affiliation is "Methodological problems" with the discriminant values
(1.15, 2.00) in the activity dimension.

These circumstances can be studied in more detail by examining the
linkages of agent clusters with regard to the values presented in Appendix 1.
The agent cluster "Dissemination of information" that includes the agents
"Literature" and "Symposium" (see Box 1) will be used to give an example
of such an examination. If the agent cluster is related to the concept cluster
"Bibliographic reference" (that encompasses the concepts "Literature",
"Reference" and "Journal") via the verb linkages, it turns out that the actions
express a weak positive evaluation (m = 4.59), passivity (m = 3.21) and weak
power (m = 3.32).

The second cluster that describes the first function is the concept cluster
"Channels of information" (for definition see Box 2). The verbs that relate the
agent cluster to this concept cluster express actions that imply a low positive
evaluation (m = 4.60) and a certain activity (m = 4.21) though with weak power
(m = 3.76).

If we examine this agent cluster's relationship with the concept cluster that
defines the positive pole of the second function, we can establish that the
actions relating literature and symposium to "Problem discussion" express a
negative evaluation (m = 3.46), passivity (m = 2.39) and weak power (m = 3.75).

Against the background of this result, the conclusion we can arrive at 
is that literature and symposiums do not contribute noticeably to problem 
discussions. Moreover this agrees entirely with our expectation that the 
activities connected with symposiums do not express much dynamic, i.e. 
activity or power. Since this is the case, the actions should not express any 
very great positive evaluation either. 

The negative pole of the second function is defined by the concept cluster
"Information on Research Methods" (for definition, see Box 2). The verbs
relating this concept cluster to "Literature" and "Symposium" express actions
that are somewhat more positive (m = 4.06) in their evaluation. They are also
somewhat more active (m = 4.11) and show a marginal increase in power
(m = 3.88).
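
The means quoted in the preceding paragraphs for the agent cluster
"Dissemination of information" can be collected and compared with the scale
midpoint. The sketch below assumes that the seven-point bipolar scales are
coded 1 to 7, so that 4 is the neutral point; this coding and all names in the
code are assumptions made for the illustration. The means are those quoted in
the text, with the activity mean for research methods read as 4.11.

```python
MIDPOINT = 4.0  # assumed neutral point of a scale coded 1..7

DIMENSIONS = ("evaluation", "activity", "power")

# Means per concept cluster, in the order of DIMENSIONS, as quoted in the text.
means = {
    "Bibliographic reference": (4.59, 3.21, 3.32),
    "Channels of information": (4.60, 4.21, 3.76),
    "Problem discussion": (3.46, 2.39, 3.75),
    "Information on research methods": (4.06, 4.11, 3.88),
}


def deviations(values, midpoint=MIDPOINT):
    """Signed distance of each mean from the midpoint, rounded to 2 decimals."""
    return tuple(round(v - midpoint, 2) for v in values)


table = {concept: deviations(vals) for concept, vals in means.items()}

# Concept cluster whose three means lie closest to the neutral point overall.
closest = min(table, key=lambda c: sum(abs(x) for x in table[c]))

for concept, devs in table.items():
    cells = ", ".join(f"{d}: {x:+.2f}" for d, x in zip(DIMENSIONS, devs))
    print(f"{concept:32s} {cells}")
print("closest to neutral:", closest)
```

Under the assumed coding, positive deviations indicate the positive, active or
strong pole and negative deviations the opposite pole.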

We can draw the following conclusion. The actions that associate
dissemination of information via literature and symposium with information on
research methods are rather neutral in all three aspects, namely evaluation,
activity and power. But it is also plain that information on research methods
is to a somewhat greater extent sought and disseminated via literature and
symposium than is the case with such information as is of importance for
problem discussions.

SUMMARY AND DISCUSSION



ANACONDA implies an attempt to make the method for content analysis more 
objective and more flexible than classical content analyses are. In our case 
to objectify means that originally subjective functions are transferred to 
computers, while flexibility means that ANACONDA is supplied with routines 
that are distinguished by a large capacity for retrieval. The development of 
an analysis method that is characterized by such qualities demands careful, 
reliable and valid analyses. This type of method development can only take 
place step-wise and in interaction with the basic material. 

In this report we have described the outcome of the first step in the 
quantification of the AaO paradigm, in which we have only made use of the 
empirical qualification of the verb. The continued method development involves 
a study of each individual element (in Fig. 1) and its relative contribution of 
information. The most immediate work planned concerns an analysis of the 
importance of the adjectives and a quantification of the adverbs, plus an 
analysis of the relative contributions of the adverbs. 

One demand that must be made on ANACONDA is that the method must
lead to a valid systematization of verbal statements. An attempt to demonstrate
the validity of the method will now be described. It draws on the
systematization of the researchers' answers to questions concerning
information and documentation, which exists in the form of the
impressionistic content analysis (see Annerblom, 1974), and on the
evaluation of the assessments on seven-point bipolar assessment scales
presented in B. Bierschenk (1974, pp. 63-69). It should be
possible to compare the compressed summary given with the results of the
discriminant analysis and the conclusions presented in that context.

A thorough and systematic check of research publications is the excep- 
tion rather than the rule. It has also been known for researchers first to 
gather data and then look for suitable literature. Personal contacts are felt 
to be the best source of information, but do not appear to play an important 
part in the researcher's attempts to bring about a supplementary exchange of 
information. The assessments show that the researchers search primarily for 
information that will help them to develop an idea, so that the product will be
a well-faceted problem, the various facets of which will be suited to a scientific
attack. Information for the demarcation of concepts appears to be of a quite
special type, since it is not sought together with other information that is of 
importance for the development of the research strategy. The analysis shows 
a negative relation between this type on the one hand and on the other, opinions
and interpretations, empirical relations and evidence, norms and conventions, 
measuring instruments or methods for the processing of data that is to be col- 
lected. 

The process of problem formulation is highly dependent on the researcher's
information-searching behaviour and determination to become acquainted with
the research literature in his own field. The impressionistic analysis of the
researchers' comments shows that the library is in many cases used because of
good personal relations with the library staff (N.B. cluster 46 contains person
as an element). Reference organs such as ERIC, Psychological Abstracts (PA)
and others form one group that researchers use. It emerges from the comments
on the evaluation of reference organs that they "feel dissatisfaction" and that
they wish for measures to be taken to improve the quality. ERIC, for example,
has attracted little attention (N.B. cluster 46 contains in addition to various
types of library the different reference organs mentioned here). Symposiums are
attended roughly once a year, but only by certain researchers. Our assessments
show that information obtained from reference organs and symposiums is
evaluated lowest (with the exception of foreign symposiums).

The methods used in searching for references to literature are unsystematic 
and employed periodically. Often the researcher starts from references in cur- 
rent literature and searches backwards from these in handbooks, journals and 
articles. The expectations of obtaining information from libraries are low, how- 
ever. In the suggestions for improvements a wish is expressed for a better over- 
all view and help in structuring the enormous flow of information. But it is also 
said that the researchers need not stress their way through masses of literature 
for fear of missing something. Conversations with other researchers, regular 
project meetings and seminars are used, on the other hand, for problem dis- 
cussion and informal literature seminars to stimulate the interest in reading. 
These problem discussions appear to be the main source of ideas and problem 
demarcation, since the researchers primarily try to get bibliographic referen- 
ces via different types of channels of information (symposiums, libraries, refe- 
rence organs, persons and handbooks). 

The researchers do not appear to search for information on research methods 
while the problem discussion is underway or when information for the demarca- 
tion of concepts is sought. This result is also supported by all the critical opi- 
nions on printed information material. Ideas and suggestions do not seem to be 
particularly accessible via this type of information. Nor is information on re- 
search methods that is available in handbooks and works of reference sought to 
any great extent. Instead such information is sought mainly from tutors and
fellow-researchers.

If the impression given by this account of results is compared with the re- 
sults given as an example of the outcome of the discriminant analysis together 
with the comments to be found in connection with Table 5, there can be little 
doubt about the agreement between the results, namely: 

1. Bibliographic references are sought via different types of 
channels of information. 

2. The information the researcher tries to get via problem 
discussions is different from that which he searches for 
via channels of information. 

3. The researcher seeks information for demarcation of con- 
cepts mainly via problem discussions. Information on re- 
search methods, on the other hand, is sought neither by the 
use of different channels of information nor through prob- 
lem discussions. 

4. Information for demarcation of concepts forms a particular 
type and seems to be negatively related to information on 
research methods. 

5. Dissemination of information in the form of literature and 
symposiums is related via actions to bibliographical refe- 
rences, channels of information, problem discussion and 
research methods. The evaluations, the activity and the 
power these actions express show neutral to negative atti- 
tudes. 

The problem in connection with an empirical analysis is to choose suitable or 
strategic parts in a set of data. This cannot take place independently of a relati- 
vely explicitly described initial model or theory, however. 

The theory on the research process that has guided the collection of the
interview material has been described in detail in B. Bierschenk (1974), and the
theory of the underlying cognition processes that guided the empirical analysis
is discussed at length in Bierschenk & Bierschenk (1976). We shall now
illustrate the way in which our empirical results can be introduced into the
initial models.

We should perhaps emphasize here that the basic material for this analysis 
has been limited to the researchers' answers to questions concerning informa- 
tion and documentation. The arguments for this selection have been put forward 
in several reports but will be repeated here. This material has been chosen for
the purpose of studying the information-seeking strategies of researchers. But 
the material was also chosen in the hope that the analysis results would prove 
to be intuitively meaningful, since the questions on information and documenta- 
tion are concrete. 

An analysis of cognitive processes presupposes that they can be represented, 
i.e. be made manifest. On the manifest level in our psycholinguistic process
model, we study linguistic elements and syntax quantitatively. The
results in this report concerning the manifest level consist of numeric
descriptions in the form of different observed frequencies. They are reported
mainly in Chapter 2.1.

The other level in the model symbolizes relations between concepts. In
this sense the AaO paradigm approximates complex cognitive phenomena. The
fundamental assumption made here is that knowledge of the direction of the
actions is of central importance for a behavioural (scientific) analysis of the
relations between concepts. The analysis result related to this level is
described in this report mainly in Chapters 2.2.2 - 2.2.5.

The hypothesis on which the analysis on the next level is based is the
following: The central importance of directed activity emerges from the verb's
function in the determination of the nature of the AaO paradigm. Nouns (agents,
objects) in the clause form distinct units (mneme) that are operationalized by
means of verbs that lose their meaning, i.e. they are stored as abstract
relations between nouns. Based on this hypothesis, selected objects are
transformed in Chapter 2.3 into empirically specified concepts.
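
A minimal sketch of how a clause might be represented under this hypothesis
is given below. The class name, the field names, the example clauses and their
ratings are hypothetical and serve only as illustration; the report does not
specify a data format. The sketch assumes that each verb carries ratings on the
three assessment dimensions (evaluation, activity, power) and that the verb
itself is dropped once the relation between the nouns has been stored.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Clause:
    agent: str         # noun that acts
    verb: str          # action linking agent and object
    obj: str           # noun acted upon
    evaluation: float  # hypothetical verb ratings, assumed coded 1..7
    activity: float
    power: float


def relations(clauses):
    """Collapse clauses into abstract agent-object relations.

    The verbs are discarded; each (agent, object) pair keeps only the
    mean rating of its verbs on the three dimensions.
    """
    grouped = {}
    for c in clauses:
        grouped.setdefault((c.agent, c.obj), []).append(
            (c.evaluation, c.activity, c.power)
        )
    return {
        pair: tuple(round(mean(dim), 2) for dim in zip(*ratings))
        for pair, ratings in grouped.items()
    }


# Hypothetical clauses and ratings, invented for the illustration.
clauses = [
    Clause("researcher", "discusses", "problem", 5.0, 5.0, 4.0),
    Clause("researcher", "avoids", "problem", 3.0, 3.0, 4.0),
    Clause("library", "supplies", "reference", 5.0, 4.0, 3.0),
]
rel = relations(clauses)
for (agent, obj), ratings in rel.items():
    print(agent, "->", obj, ratings)
```

The dictionary keys contain nouns only, which mirrors the idea that the verb
survives solely as a quantified relation between agent and object.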

On the last level in our psycholinguistic process model the assumption is
made that the concepts (based on observable clauses) that have been created
consist of empirical evidence, on which plans are developed and functionalized,
i.e. become strategies. The result analyses concerning this level are
described in Chapter 2.3, primarily Table 5.

The result of the cognition-psychological analysis will now be utilized to
make explicit how researchers perceive problems and by which methods they
try to solve problems, i.e. achieve scientific goals. It must be possible to
formalize each problem, i.e. we must be able to formulate hypotheses. It must
be possible to give methods a concrete form and instrumentalize goals, i.e.
we must be able to develop techniques by means of which scientific goals can
be achieved. (A more detailed model and description may be found in
B. Bierschenk, 1974, pp. 13-25).

The results presented in points 1-5 above will now be introduced into this 
model. 

The researcher's plan for solving his information problem contains
intentions and goal notions, plus an idea of which means can be used to achieve
goals, i.e. means-goal hierarchies. The intention is to get in principle two
types of information: (1) for demarcation of concepts and (2) about research
methods.

The strategy (means) that has been designed for the first type is problem
discussion (discussion seminars, project meetings, informal literature
seminars). The strategy designed for the second type is to a certain extent
bibliographic information-seeking and visits to international symposiums. But
primarily tutors and fellow-researchers are asked. Different strategies are used
for getting information about demarcation of concepts and information on
research methods respectively. Since both types of information are negatively
related to each other, we can draw the conclusion that the information-seeking
strategy used is related to the type of information sought.

Instrumentalization, i.e. the technical systems available for channeling
information, is used to a certain extent to obtain bibliographic references,
i.e. information about information. But the actions that form the building
blocks of the researchers' information-seeking strategies express a neutral to
negative attitude.

5. APPENDICES

1. Mean values and standard deviations for six concept clusters.
2. Probable group affiliation and discriminant values.

Table 7. Mean values and standard deviations for 14 agent clusters,
6 concept clusters and 3 assessment dimensions

Agent clusters

 1 Social-psychologically oriented researchers
 2 Pupil-oriented researchers
 3 Language-oriented researchers
 4 Science-oriented researchers
 5 Cognition-psychology-oriented researchers
 6 Researchers interested in methodological problems
 7 Researchers interested in programmes for applying influence
 8 Linguistically-oriented researchers
 9 Humanistically-oriented researchers
10 Researchers interested in socialization
11 Dissemination of information
12 Type of information and transference of information
13 We
14 Person

For description of agent clusters, cf. Table 7.

Table 9. Power dimension with poles strong-weak

Table 10. Activity dimension with poles active-passive

For description of agent clusters, cf. Table 7.

P(G|X): Probability that a member of the predicted group really is a member
of that group, provided that the actual group membership is known.